Rothamsted Repository

CiteSeerX

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Journal of Statistical Software

MURAL - Maynooth University Research Archive Library

Introducing the GWmodel R and Python packages for modelling spatial heterogeneity

Author: Brunsdon Chris
Charlton Martin
Gollini Isabella
Harris Paul
Lu Binbin
Publication date: 01/01/2013

In the very early developments of quantitative geography, statistical techniques were invariably applied at a ‘global’ level, where moments or relationships were assumed constant across the study region (Fotheringham and Brunsdon, 1999). However, the world is not an “average” space but is full of variation, and as such, statistical techniques need to account for different forms of spatial heterogeneity or non-stationarity (Goodchild, 2004). Consequently, a number of local methods were developed, many of which model non-stationary relationships through some adaptation of regression. Examples include the expansion method (Casetti, 1972), random coefficient modelling (Swamy et al., 1988), multilevel modelling (Duncan and Jones, 2000) and space varying parameter models (Assunção, 2003). One such localised regression, geographically weighted regression (GWR) (Brunsdon et al., 1996), has become increasingly popular and has been broadly applied in many disciplines outside of its quantitative geography roots, including regional economics, urban and regional analysis, sociology and ecology. There are several toolkits available for applying GWR, such as GWR3.x (Charlton et al., 2007); GWR 4.0 (Nakaya et al., 2009); the GWR toolkit in ArcGIS (ESRI, 2009); the R packages spgwr (Bivand and Yu, 2006) and gwrr (Wheeler, 2011); and STIS (Arbor, 2010). Most focus on the fundamental functions of GWR or on some specific issue; for example, gwrr provides tools to diagnose collinearity. As a major extension, we report in this paper the development of an integrated framework for handling spatially varying structures via a wide range of geographically weighted (GW) models, not just GWR. All functions are included in an R package named GWmodel, which is mirrored by a set of GW modelling tools for ESRI’s ArcGIS, written in Python.
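To make the idea of a localised regression concrete, the following is a minimal sketch of a GWR-style fit with a single predictor. It is not the GWmodel implementation: the function names, the Gaussian distance-decay kernel, the fixed bandwidth and the toy data are all assumptions chosen for illustration. At each target location the observations are weighted by their distance to that location, and an ordinary weighted least squares line is fitted.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian distance-decay kernel (an assumed choice of kernel)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def gwr_fit_at(target, coords, x, y, bandwidth):
    """Weighted least squares fit of y = b0 + b1 * x, centred on `target`."""
    w = [gaussian_weight(math.dist(target, c), bandwidth) for c in coords]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Hypothetical toy data: the slope of y on x grows from west (e = 0) to east (e = 9).
coords, x, y = [], [], []
for e in range(10):
    for xv in (0.0, 1.0, 2.0):
        coords.append((float(e), 0.0))
        x.append(xv)
        y.append((1 + e) * xv)

slope_west = gwr_fit_at((0.0, 0.0), coords, x, y, bandwidth=1.0)[1]
slope_east = gwr_fit_at((9.0, 0.0), coords, x, y, bandwidth=1.0)[1]
# A very large bandwidth weights all points almost equally, recovering the global fit.
slope_global = gwr_fit_at((0.0, 0.0), coords, x, y, bandwidth=1e6)[1]
print(slope_west, slope_east, slope_global)
```

With a small bandwidth the estimated slope differs between the western and eastern ends of the study region, whereas a very large bandwidth reproduces the single ‘global’ slope that the abstract contrasts with local methods.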

Package ‘GWmodel’

Author: Brunsdon Chris
Charlton Martin
Gollini Isabella
Harris Paul
Lu Binbin
Nakaya Tomoki
Publication venue: National University of Ireland Maynooth
Publication date: 01/01/2015

In GWmodel, we introduce techniques from a particular branch of spatial statistics, termed geographically-weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localised calibration provides a better description. GWmodel includes functions to calibrate: GW summary statistics, GW principal components analysis, GW discriminant analysis and various forms of GW regression; some of which are provided in basic and robust (outlier resistant) forms.
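As an illustration of the simplest item in that list, a GW summary statistic, here is a small sketch of a geographically weighted mean and standard deviation. It does not call GWmodel; the function name, the Gaussian kernel, the bandwidth and the data are assumptions made for the example.

```python
import math


def gw_summary(target, coords, values, bandwidth):
    """Geographically weighted mean and standard deviation at `target`.

    Each observation is weighted by a Gaussian kernel of its distance
    to the target (an assumed kernel; others are possible).
    """
    w = [math.exp(-0.5 * (math.dist(target, c) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    mean = sum(wi * v for wi, v in zip(w, values)) / sw
    var = sum(wi * (v - mean) ** 2 for wi, v in zip(w, values)) / sw
    return mean, math.sqrt(var)


# Hypothetical data: two low values in the west, two high values in the east.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
values = [1.0, 3.0, 20.0, 22.0]

mean_west, sd_west = gw_summary((0.5, 0.0), coords, values, bandwidth=1.0)
mean_east, sd_east = gw_summary((10.5, 0.0), coords, values, bandwidth=1.0)
global_mean = sum(values) / len(values)
print(mean_west, mean_east, global_mean)
```

The two local means sit close to the values observed nearby, while the global mean describes neither region well, which is the situation the abstract says GW models are suited to.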

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Joint modelling of multiple network views

Author: Isabella Gollini
Thomas Brendan Murphy
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2013

Latent space models (LSM) for network data were introduced by Hoff et al. (2002) under the basic assumption that each node of the network has an unknown position in a D-dimensional Euclidean latent space: generally the smaller the distance between two nodes in the latent space, the greater their probability of being connected. In this paper we propose a variational inference approach to estimate the intractable posterior of the LSM. In many cases, different network views on the same set of nodes are available. It can therefore be useful to build a model able to jointly summarise the information given by all the network views. For this purpose, we introduce the latent space joint model (LSJM) that merges the information given by multiple network views assuming that the probability of a node being connected with other nodes in each network view is explained by a unique latent variable. This model is demonstrated on the analysis of two datasets: an excerpt of 50 girls from 'Teenage Friends and Lifestyle Study' data at three time points and the Saccharomyces cerevisiae genetic and physical protein-protein interactions
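The relationship between latent distance and link probability can be sketched as follows. The abstract states only that closer nodes are more likely to be connected and that one latent variable per node explains every network view; the logistic link, the use of squared Euclidean distance, the view-specific intercepts and all names and values below are assumptions for illustration, and no inference (variational or otherwise) is performed.

```python
import math


def link_probability(alpha, z_i, z_j):
    """Edge probability with logit(p) = alpha - squared distance (assumed form)."""
    d2 = sum((a - b) ** 2 for a, b in zip(z_i, z_j))
    return 1.0 / (1.0 + math.exp(-(alpha - d2)))


# Hypothetical latent positions, shared by every network view.
positions = {"a": (0.0, 0.0), "b": (0.5, 0.0), "c": (3.0, 0.0)}
# Hypothetical view-specific intercepts controlling overall density.
view_intercepts = {"view_1": 1.0, "view_2": -0.5}


def joint_link_probabilities(i, j):
    """Edge probability of the dyad (i, j) in each view, from one set of positions."""
    return {
        view: link_probability(alpha, positions[i], positions[j])
        for view, alpha in view_intercepts.items()
    }


print(joint_link_probabilities("a", "b"))
print(joint_link_probabilities("a", "c"))
```

Because the positions are shared, a pair of nodes that is close in the latent space is relatively likely to be linked in every view, while the intercepts let the views differ in how dense they are overall.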

CiteSeerX

Research Repository UCD

Irish Universities

MURAL - Maynooth University Research Archive Library

FigShare

Mixture of latent trait analyzers for model-based clustering of categorical data

Author: Isabella Gollini
Thomas Brendan Murphy
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/02/2013

Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary data and/or categorical data, but due to an assumed local independence structure there may not be a correspondence between the estimated latent classes and groups in the population of interest. The mixture of latent trait analyzers model extends latent class analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable; the discrete latent class accommodates group structure and the continuous latent trait accommodates dependence within these groups. Fitting the mixture of latent trait analyzers model is potentially difficult because the likelihood function involves an integral that cannot be evaluated analytically. We develop a variational approach for fitting the mixture of latent trait models and this provides an efficient model fitting strategy. The mixture of latent trait analyzers model is demonstrated on the analysis of data from the National Long Term Care Survey (NLTCS) and voting in the U.S. Congress. The model is shown to yield intuitive clustering results and it gives a much better fit than either latent class analysis or latent trait analysis alone
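The integral that makes the likelihood awkward can be shown in a small sketch. This is not the variational method proposed in the paper: it swaps in brute-force midpoint integration over a one-dimensional standard normal trait, which is only practical in very low dimensions. The logistic response function, the parameter names and the example values are assumptions.

```python
import math


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


def class_likelihood(x, intercepts, slopes, n_grid=2000, limit=8.0):
    """Probability of binary pattern x within one latent class.

    Integrates the product of item response probabilities over a
    standard normal latent trait y using the midpoint rule on
    [-limit, limit].
    """
    step = 2 * limit / n_grid
    total = 0.0
    for k in range(n_grid):
        y = -limit + (k + 0.5) * step
        density = math.exp(-0.5 * y * y) / math.sqrt(2 * math.pi)
        prob = 1.0
        for xm, bm, wm in zip(x, intercepts, slopes):
            p = logistic(bm + wm * y)
            prob *= p if xm == 1 else 1.0 - p
        total += prob * density * step
    return total


def mixture_likelihood(x, class_weights, intercepts, slopes):
    """Mix the class-specific likelihoods with the latent class proportions."""
    return sum(
        eta * class_likelihood(x, b, w)
        for eta, b, w in zip(class_weights, intercepts, slopes)
    )


# Hypothetical two-class model for two binary items.
class_weights = [0.6, 0.4]
intercepts = [[1.0, -1.0], [-0.5, 0.5]]
slopes = [[0.8, 0.8], [0.0, 1.5]]
patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
pattern_probs = {
    x: mixture_likelihood(x, class_weights, intercepts, slopes) for x in patterns
}
print(pattern_probs)
```

When every slope is zero the trait drops out and the model reduces to latent class analysis with locally independent items; non-zero slopes introduce dependence between items within a class.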

Theory of statistics, basics, and fundamentals

Author: Gollini Isabella
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

A multilayer exponential random graph modelling approach for weighted networks

Author: Caimo Alberto
Gollini Isabella
Publication venue: 'Elsevier BV'
Publication date: 01/01/2019

A new modelling approach for the analysis of weighted networks with ordinal/polytomous dyadic values is introduced. Specifically, it is proposed to model the weighted network connectivity structure using a hierarchical multilayer exponential random graph model (ERGM) generative process where each network layer represents a different ordinal dyadic category. The network layers are assumed to be generated by an ERGM process conditional on their closest lower network layers. A crucial advantage of the proposed method is the possibility of adopting the binary network statistics specification to describe both the between-layer and across-layer network processes, thus facilitating the interpretation of the parameter estimates associated with the network effects included in the model. The Bayesian approach provides a natural way to quantify the uncertainty associated with the model parameters. From a computational point of view, an extension of the approximate exchange algorithm is proposed to sample from the doubly-intractable parameter posterior distribution. A simulation study is carried out on artificial data and applications of the methodology are illustrated on well-known datasets. Finally, a goodness-of-fit diagnostic procedure for model assessment is proposed.
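One way to picture the layered representation is to threshold the ordinal weights into nested binary networks. The sketch below assumes that layer k contains every dyad whose value is at least k, so that each layer is a subset of the one beneath it; this reading of the abstract, along with the function name and the toy data, is an assumption, and no ERGM is fitted.

```python
def ordinal_layers(weights, n_levels):
    """Split an ordinal weighted network into nested binary layers.

    weights: dict mapping a dyad (i, j) to an ordinal value in 0..n_levels.
    Returns a list whose element k - 1 is the set of dyads with value >= k.
    """
    return [
        {dyad for dyad, value in weights.items() if value >= k}
        for k in range(1, n_levels + 1)
    ]


# Hypothetical three-node network with dyadic values in {0, 1, 2}.
weights = {("a", "b"): 2, ("a", "c"): 1, ("b", "c"): 0}
layers = ordinal_layers(weights, n_levels=2)
print(layers)
```

Under this representation each layer is an ordinary binary network, which is why binary network statistics can be reused to describe it, and a dyad can only appear in a layer if it is already present in the layer below.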

Research Repository UCD

Arrow@TUDublin

Irish Universities

Bayesian computational algorithms for social network analysis

Author: Caimo A.
Gollini Isabella
Publication venue: 'Wiley'
Publication date: 13/04/2015

Interest in statistical network analysis has grown massively in recent decades and its perspective and methods are now widely used in many scientific areas that involve the study of various types of networks for representing structure in many complex relational systems such as social relationships, information flows, and protein interactions. Social network analysis is based on the study of social relations between actors so as to understand the formation of social structures by the analysis of basic local relations. Statistical models have started to play an increasingly important role because they give the possibility to explain the complexity of social behaviour and to investigate issues on how the global features of an observed network may be related to local network structures. In this chapter, we review some of the most recent computational advances in the rapidly expanding field of statistical social network analysis using the R open-source software

Rapidly bounding the exceedance probabilities of high aggregate losses

Author: Gollini Isabella
Rougier J.
Publication venue: 'Infopro Digital Services Ltd'
Publication date: 07/07/2015

We consider the task of assessing the right-hand tail of an insurer's loss distribution for some specified period, such as a year. We present and analyse six different approaches: four upper bounds and two approximations. We examine these approaches under a variety of conditions, using a large event loss table for US hurricanes. For its combination of tightness and computational speed, we favour the Moment bound. We also consider the appropriate size of Monte Carlo simulations, and the imposition of a cap on single event losses. We strongly favour the Gamma distribution as a flexible model for single event losses, for its tractable form in all of the methods we analyse, its generalisability, and because of the ease with which a cap on losses can be incorporated.
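The flavour of a moment-based upper bound on an exceedance probability can be sketched as below. The abstract does not define the Moment bound, so this uses the standard form, the minimum over integer k of E[X^k] / s^k, as an assumption. It is applied to a single Gamma-distributed loss rather than to the aggregate loss over an event loss table, and the function names and parameter values are hypothetical.

```python
import math


def markov_bound(mean, s):
    """Markov's inequality: P(X >= s) <= E[X] / s for non-negative X."""
    return min(1.0, mean / s)


def moment_bound(raw_moment, s, max_order=50):
    """Minimum over k = 1..max_order of E[X^k] / s^k, capped at 1."""
    return min(1.0, min(raw_moment(k) / s ** k for k in range(1, max_order + 1)))


def gamma_raw_moment(shape, scale):
    """Return k -> E[X^k] for X ~ Gamma(shape, scale)."""
    def moment(k):
        return scale ** k * math.exp(math.lgamma(shape + k) - math.lgamma(shape))
    return moment


# Gamma(1, 1) is the unit exponential, whose exact tail is exp(-s).
threshold = 10.0
exact_tail = math.exp(-threshold)
markov = markov_bound(mean=1.0, s=threshold)
moment = moment_bound(gamma_raw_moment(1.0, 1.0), threshold)
print(exact_tail, moment, markov)
```

Because the first-order term of the moment bound is the Markov bound itself, the moment bound can never be looser than Markov's inequality, and it needs only moments rather than simulation.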